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ABSTRACT 

For a given misfit junction, a specified optimality measure of a model, its gradient describes the 
manner in which one may alter properties of the system to march towards a stationary point. The 
adjoint method, arising from partial-differential-equation-constrained optimization, describes a means 
of extracting derivatives of a misfit function with respect to model parameters through finite com- 
putation. It relies on the accurate calculation of wavefields that are driven by two types of sources, 
namely the average wave-excitation spectrum, resulting in the forward wavefield, and differences be- 
tween predictions and observations, resulting in an adjoint wavefield. All sensitivity kernels relevant 
to a given measurement emerge directly from the evaluation of an interaction integral involving these 
wavefields. The technique facilitates computation of sensitivity kernels (Frechet derivatives) relative 
to three-dimensional heterogeneous background models, thereby paving the way for non-linear iter- 
ative inversions. An algorithm to perform such inversions using as many observations as desired is 
discussed. 



1. INTRODUCTION 

Diffuse seismic wavefields, present in a variety of media, such as stars and planets, may be created by the action 
of stochastic sources of wave excitation, where source location, amplitude and phase are random variables. Without 
knowledge of the exact realization of all relevant sources of wave excitation, raw time series of wavefield velocities 
contain no useful seismic information. However, it was discovered that seismically relevant data were contained in 
time- a ver ages over many source realizat ions of second-order correla tions of wavefield velocities in the Sun (jDuvall et al.l 
(|1993h : for noise tomography, see, e.g.. iShapiro fe Campillol (|2004D L These correlations contain components of noise, 
whose standard deviation diminishes as the inverse square root of the te mporal length of ave r aging , which is the 
consequence of wave excitation by a stationary random process (see, e.g., [Gizon fc Birchl [20021 l200l . In the bun, 
turbulent convection, driven by radiative thermal losses at the surface, is the cause of wave ge neration. This random 
process is adequately represented by a stationary and laterally homogeneous random process (jGizon fc Birchl l200l . 
Typically, the correlation time of solar convection (granulation) is 10 min and the correlation length is 1 Mm. While 
the correlation time is on the order of the wave period, the correlation length is smaller than the wavelength and thus 
the assumption of spatially uncorrelatcd sources is reasonable. The response of the Sun to excitation by turbulent 
c onvection produc es a p ower spectrum t hat p eaks near 3 mHz. 

IWoodardl (Il99l and I Gizon fe Birchl (2002) were among the first to utilize these ideas towards the construction 
of a theoretical description of h elioseismic mea s urements. A prescription to com pute sensitivity ke r nels and model 
excita tion noise was described bv | Gizon fe Bir ch (2002,^001). Various authors, e.g.. lBirch et"aH (f200l . lBirch fe Gizonl 
(|2007h and lJackiewicz et al.l (|2o6t1 ). subsequently used this theory to derive sensitivity kernels for flows and sound-speed 
perturbations for tran slationally-invariant (laterally homogeneous) background models. The basic recipe described in 
IGizon fe Birchl (|2002l ) to compute travel-time sensitivity kernels for randomly excited waves was general; however no 
attempt was made to relate this method to the adjoint method, which enables computation of kernels for heterogeneous 
background models using numerical wave sim ulation s. 

Th e adjoint method ha s a long history (jLionsI 119711 ) and is widely used in fluid contro l (e.g., iBewlev et al.1 
1 20011: iGiles fe Piercd 120001 and references therein), a irfoil optimization (e.g., Uamesor] Il988fl. meteorology (e.g., 
i LeDimet fc Talagrandlll986l ; iTalagrand fc Courtierlfl987|) and terrestrial seismology (e.g. jTarantoialll984l ; iTromp et al.1 
2005, 2010). Real- world optimization problems are typically functions of large numbers of parameters, ill posed and 
computationally expensive. For instance, one may envisage the difficulty in minimizing drag due to flow over an airfoil 
or seeking a model of Earth's interior that optimally fits observed seismograms, simply due to large number of ways 
one may alter the system. What parameters should one vary in order to achieve optimality? It is evident that the 
gradient of the misfit function with respect to various parameters tells us how to march towards a stationary point, 
i.e., a point at which the derivative of a quantity vanishes. The adjoint method provides an algorithm to compute 
Frechet derivatives and hence the gradient with relatively small computational expense. 

In this article we extend the adjoint method to the computation of helioseismic sensitivity kernels relative to arbi- 
trarily heterogeneous background models. The complexity of equations in the presence of strong lateral inhomogeneity 
is such that evaluation of kernels must proceed by computational means. A remarkable outcome of allowing for lateral 
(horizontal) variations in the background model is the ability to compute vector kernels for magnetic fields. Since 
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the perturbation induced by fields scales as (3(|B| 2 ), where B is the background field, and the action of the Lorentz 
force is anisotropic, it is non-trivial to derive magnetic field kernels about a ID solar model. However, if we were to 
linearize around a 3D background model that contains an embedded field B, kernels describing shifts in helioseismic 
measurements due to small vector variations in the magnetic field emerge naturally. 

Evaluating kernels in the context of helioseismology requires the computation of six wavefields per measurement. 
Losing the luxury of being able to translate kernels from one horizontal position to another therefore comes at a 
stiff computational price, since one must, in principle, evaluate kernels around each observational pixel, an impossible 
feat i n helioseismology owing to the vast numbers of observations. Michelson Doppler Imager (MDI; iScherrer et al.l 
1995) records velocities at approximately 1 million points on the solar photosphere, and with the advent of Solar 
Dynamics Observatory (SDO), Helioseismic and Magnetic Imager (HMI) now captures velocities at more than 16 
million pixels every 45 seconds. Computing kernels at all these points is neither computationally feasible nor is 
it clear that there is sufficient independent information to require such a massive calculation. Consequently, we 
introduce the concept of "master pixels" , a finite constellation of points which we consider interesting enough to invest 
this sizeable computational effort. However, once a number of these pixels have been chosen, every cross correlation 
meas urement, one of wh ose antennae is a master pixel, may be utilized in the inversion without affecting computational 
cost (jTromp et alJl20ldh . 

In this article, we shall primarily discuss the mathematical underpinnings of the adjoint method and its applica- 
bility to helioseismology. A computational algorithm to implement the analysis is described. Perturbations, such as 
sunspots, are significant deviations from the quiet Sun and shifts in helioseismic measurements in and around sunspots 
are substantial and unlikely to scale linearly with perturbation strength (when measured relative to the quiet Sun) . A 
means of carrying out iterative inversions in such situations is described. With the increasing availability of compu- 
tational resources, demand for greater accuracy in the interpretation of helioseismic measurements and the advent of 
higher-quality observations, the introduction of such a technique is thought to be timely. 

2. GOVERNING EQUATIONS OF THE HELIOSEISMIC WAVEFIELD 
We start with a background state in magneto-static equilibrium, described by: 

Vp = pg + (VxB)xB, (1) 

V-g = -4nGp, (2) 

where p is the background pressure, p the density, B the ma gnetic field, g = — qe r gravity, G the universal gravitational 
constant, and e r th e radially outward unit vector (e.g., ILynden-Bcll fc Ostrikerl [T967t iGoedbloed fc Poedtsl 120041 : 
ICameron et al J 120071 ). In this formalism, we consider background flows (v) to be too weak to contribute significantly 
towards maintaining equilibrium, and hence we neglect advection- related forces in equation (JXJ) - This implies that 
||v|| <C \/gL, where L is the characteristic flow length scale, and that Lorentz forces are primarily balanced by 
pressure gradients and gravity. However any flows that are present must satisfy the continuity equation, and we 
require therefore that V • (pv) = 0. The magnetic permeability constant Att/iq has been absorbed into the definition of 
the field. We do not keep rotation terms in the force balance equation because Coriolis and centrifugal forces are five 
orders in magnitude smaller than surface gravity. We also invoke the Cowling approximation, allowing us to ignore 
changes in the gravitational potential induced by wave motions. Small-amplitude wave propagation in a magnetic 
environment is described by the following dynamical wave operator (in temporal Fourier space; see Appendix |D] for 
the convention) 

= -u 2 p£ - 2iupv ■ V£ - iojpTi - V{c 2 P V ■ £) - V(£ • Vp) + gV • (p$) 
-(VxB)x[Vx(£xB)] -{Vx[Vx(£xB)]}xB, (3) 

where £ is the displacement vector and c is the background sound speed. In order, terms on the right side denote 
acceleration (first term), flow advection, wave damping, pressure restoring forces (the term with c 2 p), buoyancy (the 
next two terms) and magnetic Lorentz force (the final two) respectively. We assume that the upper boundary is placed 
far away from the solar photosphere and the wavefield satisfies zero-Dirichlct conditions ( all fluctuations are zer o on 
this bounding sur face). The entire solar interior is enclosed within this volume. Following iGizon fc Birchl (|2002T ) and 
IBirch et al.l (l200l . we mimic the complex frequency dependence of wave damping in the Sun by including the term 
— iu>T£, where F is the damping rate. We neglect second-order flow terms such as v ■ V(v • V£); this is a reasonable 
approximation when the velocities roughly satisfy ||v|| <C {wl,c, \/gL}, where uj is the characteristic wave frequency. 
It may be verified that this inequality is satisfied for most solar phenomena (e.g.. IBirch fc Gizo n 2007). The full wave 
equation is given by C£ = S, where S(x, lj) is a source term. 

Flows and damping do not follow directly from the equilibrium equations. The emergence of wave da mping is not 
well understood, and is thou ght to be due to a combination of the action of turbulence and radiation (e.g.. lDuvall et all 
1998; IKorzennik et~alll200l : we are unable to realistically account for these phenomena and are therefore forced to 
introduce phenomenological damping terms. Solar flows, as discussed previously, are typically weak perturbations to 
the backg round. Further, co nstructing a background model with flows and magnetic fields is a remarkably difficult 
task fe.g.. iBelien et al1l2002h . Such practical considerations have led us to introduce these terms in an ad-hoc fashion. 



3. MINIMIZING MISFIT 
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A common optimization problem in helioseismology is that of reducing differences between observed and predicted 
travel times. Cross-correlation amplitudes, which depend quasi-linearly on properties of the background model, are 
commonly measured but not typically used in inversions; conceptually, one may in clude these in the misfit with no 
additional effort (e.g., iGee k Jord"anlll992t iFichtner et"aLll2008t iBozdag et alll2011h . 

A convenient choice for the misfit function is the L 2 norm of these differences summed over a number of observation 
points 

? = \Y.^A^ n) -t°][t^ -r°,l (4) 
1,1' 

where r° is the observed travel time, r^™' the predicted analog with (current) background model n, specified at points 
q,q', and A fq a > the inverse of the noise covariance between the two sets of measurements, assumed to be chi-squared 
distributed (jGizon k Birchll2004D . Here the noise- covariance model is assumed to be stationary under changes of the 



background model, i.e., M qq ' does not change with iteration. Partial-differential-equation constrained optimization is 
the technique of minimizing this misfit with respect to a governing wave equation, 

1 = \ E^' H n) - ^ ~ - [ dx[du,\-(Ct-8), (5) 

q,q> 

where A is a Lagrange mul t iplier and the integration proceeds over all space x and frequency uj. As, e.g., IWoodardl 
(|1997f ) and iGizon k Birch! (|2002f ) realized, a first step towards formal interpretation of measurements is to create 
functionals linking cross correlations and travel times to the input displacement field, i.e., to establish a relation of 

the form Tq = Tq n \^). For now, we choose to represent this in an abstract fashion, and in subsequent sections move 
towards greater detail. Let us posit that a change in misfit (fj| may be written as 

SI' = J dxj du f f • ££, (6) 

where ft is a function that connects variations in displacement field <5£, to those of travel-time misfit SI' . Now changes 
in the misfit associated with the constrained problem ([5]) may be written as 

51 = Jdx J dwft •££- J dx J du[5\-( L C$-S) + \-5£$ + \-C6$], (7) 

upon invoking ([6]) and setting SS — 0. If the forward displacement field were to satisfy ££ = S, and we were able 
to eliminate terms involving <5£, then changes in the misfit would be functions only of A, £ and the perturbed wave 
operator, which depends only on background properties such as sound speed, magnetic fields, density, etc. Now in 
order to accomplish this, we need to first be able to free <5£ from the action of the operator in the third term of 
equation |7]l. The property of adjointness or duality is central to such a manipulation. An operator O is said to be 
self-adjoint if it satisfies 



dx A • <D£ = / dx £ ■ OX. (8) 

© Jq 

For the boundary conditions chosen here, it may be demonstrated that the ideal MHD operator, which contains no 
flow or dissipation terms, is an example (e.g.. iGoedbloed k Po edts 2004). However, the non-ideal operator (J3J) is not 
self-adjoint and obeys 

[ dxX C£= [ dxf£ f A, (9) 

Jq Jq 

where £ , defined as adjoint to (J3]), is given by (see appendix [A]) 

£t£ = -uj 2 p$, - iujpTg, + 2iujpv ■ V£ - V(c 2 pV • £ + £ ■ Vp) + gV • (p£) 

- [(VxB)x{Vx(£xB)} + {Vx[Vx(£xB)]}xB]. (10) 

The only difference between operators ([3]) and (flQ|) is that of a reversal in sign of the background flow term (v flips 
sign). Thus the following term may be rearranged such that 



dx du; X- CS£= / dx / dw <5£ • r T A, (11) 
J Jq J 

where C\ the adjoint (or dual) operator, acts on Lagrange multiplier A and 5£ has been effectively freed. Now, we 
choose A so as to satisfy the differential equation 

£^-^ = 0, (12) 
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leaving an elegant and simple connection between the variation in misfit and model parameters: 

51=- I dx [ duj A • 8££. (13) 



Since C depends solely on background properties (denoted collectively as {/3 S }), variations in the operator may be 
represented as effective functions of 8j3 s 

\.6CS=(x i ^£ j ) 6$ s , (14) 



df3 s 

where Einstein's summation convention is employed and C = {Cij} is a second-order tensor. Properties j3 s in this 
theory are regarded as being functions only of space; this allows us to define a kernel as 

K s = -JdLo (a,^^). (15) 

leading to 

81= [ dx J2K S 5!3 S , (16) 

which tells us how to simultaneously solve the inverse problem for all relevant helioseismic quantities: 

51= [ dx (K p Sp + K C 2 Sc 2 + K v • v + K B • SB) . (17) 

Jq 

Note we have not written out a kernel for pressure since it may be determined by considering variations in the 
equilibrium equation (JTJ. For a more detailed treatment, please see section [SJ equation ([TO)) . When performing an 
iterative inversion, it is evident from equation (|T6|) that by choosing 8j3 s = —e s K s , where e s > is a small constant, 
we arrive at, 

61=- I dx V e s K 2 s < 0. (18) 
Jq 

This is the principle of the steepest descent method. More sophisticated inverse algorithms such as the conjugate- 
gradient method, which uses previous and current gradients to construct the model update at a given iteration 
level, may be more relevant. Pre-conditioning, a technique applied to improve the condition number, may also be 
implemented. The determination of e is also non-trivial, requiring a "line search" to determine an optimal value 
(whereas a crude way is to simply set it to some small value, such as 0.02). Thus, we alter the background state by 
amounts directly proportional to the Frechet derivative, i.e., we perform the following updates: 

c 2 — > c 2 — e c K c 2 , 
p^p-e p K p , 
v —> v — e v K v , 

B — ► B — £b Kb- (19) 

The preceding set of equations describes in generality how to pose the helioseismic inverse problem; translational 
invariance is a specific case of this formalism. 

4. MEASUREMENT FUNCTIONALS 

Thus far, we have very generally described the underpinnings of the adjoint method; from this point on, we focus on 
the primary measurement in time-distance helioseismology: cross correlations. Since we are interested in determining 
the gradient of the misfit function based on travel times (Eq. [4]), we must both appreciate how travel times are 
computed and quantify their variation with respect to changes in model parameters. Varying equation (jj), we have 



SI' = [^Sr q + Ar(") St*], (20) 
1,1' 

SZ = Y,\W«t +^ q )Ar^Sr q7 (21) 

9,9' 

= E 6 9 n) ^> ( 22 ) 

9 

^E^'+AWAr^, (23) 
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where Aig = [Tq — t°]. We do not place the iteration superscript n over the variation in travel time Sr q because 
this term implicitly depends on the background, w hich evolves with each iteration. We apply the following definition 
of travel time (appendix A of iGizon fc B irch 2003) 

St= f dt' W aP (t') 8C af} {t'), (24) 



where W a p is a weight function and 5C a/ 3(t') the devia tion in the cross c orre lation, a, are measur ement pixel locations, 
and T is the length of the temporal window. Following lWoodardl (|1997| ) and lGizon fc Birchl (|2002f ). we begin by defining 
the cross correlation 

C a p{t) = ^ / <£(W) cf>(xp,t + t') dt', (25) 
1 Jo 

where cf){x, t) is the line-of-sight projected wave velocity measured at spatial point x at the solar photosphere. Appro- 
priate filters and point-spread-function contributions are assumed to have already been incorporated into the definition 
of <j)(x,t). Transformed into temporal Fourier space, this becomes 

Cap = 7p 0*(Xq, U))(j>(xp,(j). (26) 

Let Green's tensor for the system of differential equations, denoted by G(x, x',w), satisfy 

CG = 6{x - x') I, (27) 

where x is termed the "receiver" and x', "the source", and I = {<%}■ Similarly, we define the adjoint Green's tensor 
via 

£ t G t = 6(x - x') I. (28) 
Thus for an arbitrary source distribution S(x',uj), the wavefield in temporal Fourier domain is given by 

£(x,w)= f dx' G(x,x',w) • S(x',cj), (29) 

J(7) 



and in time domain, 



£(x, t) = J dx' J dt' G(x, x', t-t')- S(x', t'). (30) 



Similar relations apply to the adjoint wavefield. In order to reduce notational burden, we discontinue explicitly writing 
the uj dependence, i.e., only source and receiver locations will be included when stating Green's function. In analyses 
that follow, we shall repeatedly switch positions of the source and receiver. Green's functions in the case of a switched 
source-receiver pair satisfies the following reciprocity relation (see appendix IA^) 

G t (x',x) = G T (x,x'). (31) 

Observations are typically highly processed versions of the raw sol ar vector velocity field, subjected to point spreading 
and phase-speed filtering, line-of-sight projection, etc. Following IGizon fc Birchl (|2002[ ). we introduce vector Qj to 
denote Green's function for the filtered, line-of-sight projected velocity 

£?i(x,x') = T(x,uj) * J i (x)G y (x,x / ), (32) 

where the convolution is spatio-temporal, 1 = {li(x)} is the unit line-of-sight projection vector, and .F(x, w) contains 
all filter terms and the transformation between displacement and observed wavefield velocity. Applying equation (|31[) 
to ([32j) . we may define the reciprocal filtered Green's function 

G}(x',x) = T(x,lu) * f i (x)Gt i (x' J x). (33) 

Note that Qj(x,x') = Cfj(x',x), since all we do is to replace Gy(x,x') by its adjoint counterpart G^x'jx), to which 
it is identically equal. 

The cross correlation written in terms of Green's tensors, driven by the source <Sfc(x, w), where k is the direction of 
the dipolar source, is 

C a0 = )= [ dx' f dx" G*(x a ,x') gj{x p ,x") S*(x',uj) S,-(x",w). (34) 

Measured cross correlations are typically averaged over a large number of source-correlation times, allowing us to 
treat it as an ensemble average over many source realizations. In other words, we consider a limit cross correlation 
that has detached itself from detailed properties of source action and is sensi tive only to the statistical quantity 
(S*(x',uj) Sj(x",uj)}, where the angled brackets denote ensemble averaging (e.g.. lWoodardlfT997HGizon fc Birchll2002l : 
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iTromp eF al. 2010). In order to render this theory computable, we explicitly assume that sources at disparate spatial 
points are spatially uncorrelated, allowing us to write 

(S*(x',a,) Si(x",u;)) = <5(x' - x") ^(x', w ), (35) 

where Vij encapsulates the average temporal power spectrum, correlations between different dipole sources and the 
spatial distribution of source amplitudes. Thus the limit cross correlation becomes 



(Cap) = fj Q dx ' #(*a> x ') 9j{*P,*) 

3 correlation 

(SC a p) = ^J dx ' ^ > x ') 6Q i ^ > x ') + SQ t ( x « . x ') Sj ( x ^ > x ')l n 



Consider a variation in the cross correlation 



(36) 



(37) 



where we have chosen to neglect changes in properties of the power spectrum, i.e., SViji'x! ,u) = 0. We invoke the 
first-Born approximation to describe variations in Green's tensor due to changes in properties of the background 
medium 

C SG = -SC G. (38) 
Using Green's identity, we recover the following expression for SGij (x, x') 



SGijfax!) = - [ dx" G ifc (x,x") [SC G(x",x')] fc ,, 
where the spatial coordinate in SC is x". Finally, we have 

<^(x,x') = JF(x,u;) * [k SG^] = - [ dx" fe (x,x") [SC G(x",x')] fe ,, 



(39) 



(40) 



where SC is a function of x" and the filter .F(x, w) acts only on £j(x) Gj/-(x, x"). Considering only the first term in 
the variation of the cross correlation in equation (|37p . we have 



(Kp) = — Jjbcjdx! Pi &(*/?.*)] [SC G(x,x')] fcj V 

Rearranging the integration order, 

(&cU)=-h I to^e, 



- I dx ^(x,X,3j 



, SCkp 


[ dx' 


Gpj (x, X 








,SCkp 


[ dx' 


Gp 3 (x, x 




-J o 





(41) 

(42) 
(43) 



because (^(x^x) = £/t(x, x^) (from Eqs. [35] and [33]) and Vij(uS) is real valued. Recalling equation (|2~4"]) . and 
transforming to the temporal Fourier domain, we obtain 



St 



2tt 



Now, substituting equation (f44|) into the expression for the misfit (Eq. [5] and Eq. [22]), we obtain 
fci = -J2^f J Q dx J duj W U^) b { q n) ^(x, X/J ) { 5£ fcp £ dx' G w (x, x') (ft (x Q , x') Vi 



(44) 



(45) 



where some bijective mapping function connects q to the cross-correlation points (a, /3). We define the adjoint field to 
be 

*t^(x) = e t (x,x^)^(o;)6W ! (46) 

where subscript /c has been dropped from the right side. It is important to note that observations have been assimilated 
into the adjoint field at this stage; thus, kernels that emerge will be functions of measurements. A subtlety in 
implementation arises due to the fact that the filter that takes the raw Green's function to the observable is actually 
applied on the second spatial index, x^. Here we explicitly specify this term 



Gti^p) = [F* (kGl)]]^ = ( dx' GL(x,x') [k Ffo - x',u;)] , 

J © 

where we have assumed a laterally-invariant filter. We arrive at the following adjoint wavefield 

*t a 0(x) = f dx' Gt(x,x') • M{x,uj). 

Jo 



(47) 



(48) 



The time-domain representation of this field is 

& a0 (x,t) = J dx' J dt' G^x.x', *-*')• M(x',t'), 

where M. is a vector whose components are given by 

Mi(x,w) = W* fi {u) [ k _ x , «)] . 



(49) 



(50) 



The forward field represents correlations of the wavefield between every point in the domain and the observed pixel 
a, and is calculated in a two-step approach (because of the presence of two Green's functions). First we compute the 
filtered wavefield response to the temporal spectrum of excitation applied at point a 



r](x,w) = [ dx G f (x,x') ■ T>, 
Jq 

where using equations (j47]) and (|50"|) . we define the source 

Vj(x,x',uj) = J"(x Q - x',ui) kVij(x,u>) 

In time domain this equation is 

r){x,t)= [ dx [ dt' G i (x,x',t-t')-T>(x,x',t'). 
Jq Jo 

This response in reverse time is applied as a source again, leading to the forward wavefield 

# a (x)= / dx' G(x,x')- V *(x',uj), 

Jq 

whose time-domain representation is given by 

S a (x,i)= / dx' j dt' G(x,x',t-t') -ri(x',-t'). 
Jq Jo 

We arrive at the following interaction integral 

fiii = - ^2 yy I d * I duj * f "/3 • ( sc *«)• 

The second contribution to the variation in misfit may be written as 

^ = - E 2^T J Q dx J dUJ W ^ {UJ) b ^ ^* (X ' Xq) {^* / Q dx ' G « (X ' X ' } ^^' X ') ^ X '' } k ' 



(51) 



(52) 



(53) 



(54) 



(55) 



(56) 



(57) 



and since all of these functions have purely real temporal representations, integration over frequency allows us to use 
the relation 51% = 8I2 , whereby 



Sl 2 = -J2^ J^dx Jdu W aP {uj) Gt(x,x a ) |j£ J^dx' G pj (x,x') (gj(x',xp) V^x', 



, (58) 



which resembles equation (|45p. except for the adjoint source now being slightly different and with adjoint and source 
points exchanged. The algorithm for computing this second term remains unchanged from that required for the first 
contribution. The total misfit variation is given by 



51 = - E T^f I dxf duj #t a/J . (SC # a ) + & Pa . ye 
a,/3 27rJ 70 J 



(59) 



where wavefields and corresponding sources are read off from the two misfit contributions, dli (Eq. |45| ) and 8T2 
(Eq. [55] ) ■ In summary, we have deconstructed the meaning of the quantity "travel time" , and expressed it in terms 
of primitive wavefield descriptors such as Green's functions and sources. Next, we studied its variation with respect 
to small perturbations to the wave operator - the first step in determining the Frechet derivative. Having quantified 
its variation, we decomposed the Frechet derivative into two constituent wavefields, whose convolution, mediated by 
an operator, reduces to the sensitivity kernel for that parameter. 



5. COMPUTING SENSITIVITY KERNELS 

With suitable notation and mathematics in place, we now describe convolution relations between forward and adjoint 
wavehelds which give sensitivity kernels for various model parameters, such as background flows, sound speed, density 
and magnetic fields. The latter two, in addition to the equilibrium equation, determine the corresponding variation 
in pressure. We begin with flow kernels; changes in isolation to the flow operator are written as 6C — —2iujpw ■ V. 
Substituting this into equation (|56p . we obtain 

§Xx =2i^2 I dx I duj U P * f «/3 ' ( v ' V)* Q 

dxv-K* 1 ', (60) 

JQ 

where, 

K«(x) - 2i P Y, [<b> w (V*a) • *V (61) 

Alternately, written in time domain, the flow sensitivity kernel becomes 

K«(x) =-2j2^[dt p[Vft* a (t)] • & a p(-t), (62) 

a.j3 

where for sake of convenience, we do not explicitly state the x dependence of the two fields. If we were to compute 
forward and adjoint fields based on equations (|54|l and (|46|). then interaction (|62[) between forward and time-reversed 
adjoint fields gives us the desired sensitivity kernel. The second contribution to misfit (and therefore the kernel) must 
be computed and added to equation ([62]). i.e., 



K V = KW+Ki 2 \ (63) 

where 

Ki 2 >(x) = -2j2^fdtp (Vd t <M*)) ' *W"*)- (64) 
Next, we consider perturbations to sound speed, 5C = — V(p£c 2 V-). Substituting this in equation (1551) , we have 



611 = E Yt I dx / duj * ta/3 ' v{pSc2 v ' * a) (65) 

= Y,Yt I dx fdujV- ( P 6c 2 & a0 V ■ # a ) - pfc 2 V • *t Q/3 V • * a . (66) 

The first term reduces to a surface integral at the domain boundaries and may therefore be dropped (having assumed 
homogeneous boundary conditions as x — » oo). The sound-speed kernel reduces to 

5X X = [ dx Sine 2 K ( c l\ (67) 



K £\x) = - P c 2 J2^f J dw V • #t aj8 v # a . (68) 



a,0 

Alternately, in time domain, the sound-speed kernel is obtained upon computing 

^W(x) = ~pc 2 £l /dtV- *t^(_ t ) V • *„(<). (69) 

In order to derive kernel expressions for magnetic field, density and pressure, which are additionally constrained by 
the equilibrium equation, we consider small perturbations around ([TJ, namely 

V8p = 5pg + p5g + (VxSB)xB + (VxB)xSB. (70) 

We may ignore perturbations to the gravitational field arising from surface phenomena, such as sunspots or flows in 
the convection zone, because an overwhelming fraction of solar mass is concentrated within the radiative interior. We 



have 

51 1 = 1 ± f j dx& a p -V{* a -V5p), (71) 

= ~^fj d * V-*t a/3 $ Q . VS p, (72) 

= - ^ j dx V • *t Q ^ Q . [ Sp g + ( V x SB) x B + (V x B) x SB]. (73) 

Terms involving density and magnetic field in the misfit expression for pressure (Eq. [75]) are absorbed into kernel 
expressions for the former two quantities respectively. The density kernel follows 

Sli = I dx K p {1) 5p (74) 
Jq 

k 'p 1) = - J2 yt I doJ ^ * ta/3 ' * a ~ lLoT * ta/3 ' * q + ° 2 v ' ^ afi v ' * q 

ct,P n 

- * Q Vg St^ - g . (* a . V*t Q/3 + $ qV • #t aj8 )] , (75) 

which in time domain is 

= / * {* t ^H)-5 t 2 * Q (t) + * t Q/3 (-0-[r*3 t *aW]+ c 2 v-*t Q ^_ i)v .* Q (i) 

a,/3 

- S a (i) • Vg • *t Qj8 (_ t ) _ g . [# a ( t ) . V*t o/J (_ t ) + $ Q (<)V • #t Qj8 (_ t )]} (76) 

This is the same as the kernel expression in, e.g., equation (55) of iLiu fc Tromd (120081) with t h eir ro tation terms and 
gravitational potential variations (their tp, 5(f), and Sgo respectively) set to zero. iZhu et all (|2009l ). who term this 
the impedance kernel {K' p ), showed that it is primarily sensitive to reflection zones and discontinuities. It therefore 
remains to be seen whether much information about density variations in the solar interior may be extracted from the 
wavefield. 

Dropping the assumption of translation invariance allows us to derive simple vector expressions for magnetic field 
kernels (see appendix [B]) . The effect of small deviations from a background field on travel times is given by the 
following expression 

SL X = [ dx k4 1} ■ SB, (77) 

Jq 

k b =Z)rr I di ° Vx [ Vx (*« xB ) x * t «/3] + {Vx[*t Q/?x (VxB)]} X* Q 
+*t^x{Vx[Vx(*«xB)]} + * Q x{Vx[Vx(*t^ X B)]} 

+Vx[Bx(* Q V ■ *t Q/3 )] + V • *t Q/3 $ Q x[VxB], (78) 

which in time domain is 

K B )= 5^ / dt VX[VX(# a (f)XB)X*t a/J (_i)] + {VX[#t aj8 (_t) X (VXB)]} X# a (t) 

+*t«^(-i)x{Vx[Vx($ a (t)xB)]} + * a (f)x{Vx[Vx(#t^(-t)xB)]} 

+Vx[Bx($ a (()V ■ &an(-t))] + V • $^H)$ a (i)x[VxB], (79) 

For inversions constrained by equilibrium equation ([1]), we have arrived at kernels for four independent model param- 
eters, namely: sound speed, velocity, density, and magnetic field. 

What is the connection between kernels derived here and those computed by e.g., [Birch et al.l ((2004)? Translation 
invariance implies that everywhere in the domain of interest, differences between predicted and measured travel times 
are small and that perturbations to the background state are weak. Dividing out the Ar 9 term (setting it to some 
constant value) from expressions for the kernel and misfit, we arrive at the classical linear helioseismic forward problem 

St= [ dxK(x)-dq(x), (80) 

Jq 

where Sq is a perturbation of interest. 
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6. PERFORMING INVERSIONS, COMPUTATIONAL ALGORITHM, & COST 

In this section, we describe how the adjoint technique may be applied efficiently to perf orm large-scale inversion s 
using helioseism ic data. We begin w i th the concept of an event kernel, discussed in e.g., iBamberger et al.l (|1982f) ; 
llgel et al.l (|T996l ): lTromp et all (|200ll ; iTape et all pOOH) . the focus of the inverse procedure. Consider a seismic event 
(i.e., a source at) a whose signature is recorded at some N locations. In a computational sense, the predicted wavefield 
generated by this source event is encoded in the forward wavefield, while observations are assimilated into the adjoint 
wavefield. The expense involved in computing event kernels scales linearly with number of sources a, independently of 
the number of observation locations. This is because (as will be shown here) all N observations may simultaneously be 
injected at corresponding station locations to produce the adjoint field. In the translationally-invariant helioseismology 
case, the event kernel may be obtained by summing up N appropriately rotated and translated kernels, each weighted 
by the relevant travel-time shift 

N 

if a (x) = ^AT tt ^ a f,(x). (81) 
0=1 

We formulate algorithm ic details associated with incorporating large numbers of observations into the inversion (see 
also lTromp et al J 12010). Let us choose M master pixels, which are correlated with signals measured at N pixels, i.e., 
MN + M(M — l)/2 correlations in total, a number that scales as 0(MN) since M <C N . The associated misfit may 
be written as 

N M r . 

SI = - X X 2~r / ^ / ^ ^ af> ' SC * a + * t/3 " ' 6C *^ ' (82) 

We attempt to moderate computational cost by absorbing the summation over N into forward and adjoint sources, 
suitably redefined. The first contribution to misfit may be rewritten as 



N M ., „ „ M 



Sli = - — — f dx [ dui • SC&a = - — — f dx / doj $t Q ■ 5C& a , 



(83) 



0=1 

where the adjoint source and wavefield are given by 

N 

M i (x) = i i ^J-( X ^-x,a;) W* b n q , (84) 

iS=l 

*t ( x )= J dx' G{x,x')-M(x',u;), (85) 
Jq 



and the forward wavefield is as stated in equation ([54]) . All N cross correlations of slave pixels with master-pixel a 
are subsumed into one adjoint calculation. Computationally, this is accomplished by constructing adjoint source flM} 



as a sum over all slave pixels that are correlated with a. The partial event kernel may be computed using the 
wavefields in equation (|83[) together with kernel expressions stated in the preceding section. The second contribution 
requires some manipulation and redefinitions, namely 

Sl 2 = - X X / d *J duJ *V ' SC *P = -H 2^fl dx j duJ ^ a ' * jC *°' (86) 

j3= 1 a — 1 ® a— 1 ® 

M l (x) = l t T(x a -x,w), (87) 

N 

X 

0=1 



#t a (x)= / dx' G(x,x')- M{x',w), (88) 
Jq 

N 

D a (x,x', W ) = V^^i ) W T{x' -xp)\-V{x,u>), (89) 
p=i 

f) a (x,u)= [ dx' Gt(x,x')- T> a (x,x',w), (90) 
Jq 



* Q = / dx' G(x,x')-*7 Q (x',w). (91) 



The second contribution is constructed by interacting the two wavefields according to equation (155)) and added to Ka 
to complete the calculation of the full event kernel. Thus the vast number of observations of the solar wavefield may 
all be assimilated into the inversion but with a finite 0{M) number of calculations. The M master pixels may be 
chosen to ensure greatest coverage within the region of interest, whose locations could be decided by criteria such as 
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maximizing the sum of distances between point pairs. The algorithm, depending on whether sensitivity kernels are 
being computed or inversions are performed may be stated in the following manner: 

• Master Pixels: Choose a set of M master (a) and N slave (f3) pixels with M<JV. For instance, a constellation 
of points surrounding a sunspot or active region. 

• Intermediate Wavefield (r/, fj): If the intent is to compute kernels, source (f52"j) is applied (Eq. [5T]) and the result- 
ing wavefield is saved at all points where the wave excitation source is non-zero. Alternately, when performing 
inversions, two types of sources, given by (1521 and (|59"|) . must be applied. Because this wavefield is used to 
drive the forward simulation, it must be saved at a sufficient number of temporal points. This does not demand 
large storage requirements since only 2D slices are written out (for all practical purposes, wave excitation occurs 
at one depth). The driving source for the calculation of a sensitivity kern el is given by (|5T|) and for the event 
kernel (|89p . This is termed the generating wavefield bv lTromp et al.l (|2010f ) . 

• Forward wavefield (4>,<I>): Driven by the time-reversed intermediate wavefield displacement (r},fj) injected at 
the nominal excitation depth, with the specific choice of sources dependent on whether an event kernel (|90[) 
or a sensitivity kernel (|54| is being computed. The 3D wavefield is saved at a cadence of 30 s econds (Nyquist 
frequency of 16.66 mHz). This is termed the ensemble forward wavefield bv lTromp et alj (|2010l ). 

• Adjoint Source (At): The time- history of the forward wavefield extracted at the observation height is filtered 
according to equation (|32p and time series at all slave pixels are isolated. These form the predicted limit cross 
correlations for those point pairs. We now determine the adjoint source according to equations (|50|) . (f84|) . or (|87|) 
as the case may be (i.e., computing kernels between a point pair or performing an inversion using large numbers 
of observations). Note this is the stage where observations are assimilated into the inversion. 

• Adjoint Wavefield & Partial Kernels (<I?t , 4>'): The former is evaluated according to equations (|46|) . (|85|) . or (|8gp 
as the case may be. The 3D adjoint wavefield is saved at the same cadence as that of the forward. We may then 
compute kernels according to interaction integral (I56p . Each sensitivity or event kernel has two contributions 
which must be added together. 

• Temporal length & Computational domain size: Simulations must be run for at least as long as it takes for waves 
to arrive from the farthest contributing source to the observation points. The farther the source is, the greater 
the effects of damping and geometric spreading and thus the contribution of a source diminishes with distance 
from observation points. 

• Storage Cadence: Five to ten points per temporal wavelength is a reasonable rule of thumb. This is done in order 
to maximize the accuracy in evaluating the interaction integral (|56l) while not plac ing unnecessary dem ands on 
storage. Of course, this step may be obviated if one were to apply the algorithm of lLiu fc Trompl (|2008f ). 

• Boundary conditions: Highly- absorbent boundary conditions are recommended in order that waves that have 
propagated out do not return to the region of interest. 

For a fixed resolution and temporal extent of the calculation, both storage and computational expense increase 
linearly with the number of master pixels, i.e., computations scale as 0(5M) Green's function calculations, where M 
is the number of master pixels. This is because the intermediate wavefield needs only be computed for a time extent 
T/2 whereas the forward and adjoint wavefields must be calculated over a temporal length T. Storage cost scales 
approximately as 0(4M). The power of this technique is twofold, firstly in being able to compute all kernels relevant 
to a given measurement simultaneously from the adjoint and forward wavefields, and secondly, in assimilating as many 
observations as desired in order to perform the inversion. 

Rapid convergence, i.e., reduction in misfit, is a desirable quality of an inverse technique. Two well-known drawbacks 
of the steepest descent method are that it converges very slowly for problems where the condition number is large 
and the convergence rate is very sensitive to the local step-size (e in Eq. |19|). A much more popular and powerful 
technique is the conjugate-gradient method, which utilizes misfit gradients at current and previous iterations in order 
to determine the directionality and magnitude of the step to be taken. Preconditioning gradients in order to reduce 
the condition number and improve convergence characteristics is also a typically-employed procedure. We shall not 
describe these issues in any greater detail at present but merely note their imp ortance and that t hey need be addressed 
in any inverse procedure. For an in-depth discussion of these topics, see, e.g.. iTape et al.l ([2007D . 

An important aspect of the outcome of an inversion relates to uniqueness. Because this is an optimization problem, 
the solution may be trapp ed in a local minimum. One may attempt to avoid this pitfall by adopting the so -called 
multi-scale approach (e.g., iBunks et al.l 119951 : iSirgue fc Pratt! [2001 iRavaut et all [200l iFichtner et all [2001 which 
involves taking the following precautionary steps: 

• Choosing a "good" initial model is crucial since meaningless local optima may attract and trap the solution. In 
the case of sunspots, one may construct 3D models that are constrained by the surface field. 
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• Employ travel times of long- wavelength waves (i.e., high phase speeds) that are primarily sensitive to coarse- 
grained features of the object in question and iteratively refine the model by gradually incorporating travel times 
of smaller wavelength waves (lower phase speeds) . 

• Use different types of measurements, i.e., a variety of time-distance averaging geometries, frequency and phase- 
speed filters, in the misfit function. 

• Ensure a good match between simulations and observations at the photospheric level (e.g., photospheric sunspot 
magnetic fields or Doppler measurements of surface supergranulation). 

7. FLOW AND SOUND-SPEED KERNELS 

We u se the Seismic P r opaga tion through Active Regions and Convection (SPARC) code, developed by IHanasogd 
(|2007f k lHanasoge et al.l (|2008l ). A highly-ef ficient absorption met hod termed the convolutional perfectly- matched 
layer formulated for stratified environments (Hanasogc ct al. 2010) is applied at all boundaries. A domain of size 

250 x 250 x 35 Mm 3 is chosen, where the first two dimensions are horizontal and the third depth. The box straddles 
the photosphere, extending from 34 Mm below to 1 Mm above. The grid consists of 384 x 384 x 300 points, ensuring 
a horizontal resolution of 660 km. Vertical grid spacing decreases smoothly from about 250 km at the bottom of the 
box to around 27 km at the photosphere and above, so designed as to maintain constant acoustic travel time between 
adjacent pairs of points. 

In Figure [TJ power spectra of pre- and post-filtered intermediate wavcficlds (vertical component of rf) arc shown; we 
isolate the /-mode for this calculation. The predicted limit cross correlation, C Q( g (t) , obtained by filtering the forward 
wavefield and extracting the time series at the receiver is also shown. The positive and negative branches differ slightly 
in amplitude but show good phase agreement. 
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Fig. 1. — Normalized pre- and post-filtered power spectra of the z component of r\ on the upper panels and the /-mode dispersion relation 
u) = \/~gk overplotted (dark line; right). Below is shown the normalized predicted /-mode limit cross correlation C a p(t) of the wavefield 
measured at a pair of points with separation distance \a — /3\ = 1 Mm . Overplotted with the thin line is the "exact" cross correlation, 
estimated by inverse Fourier transforming the power spectrum (Eq. |E5| V The amplitudes of the negative and positive branches are slightly 
different but thei r phases match well. The dot-dash boxes around branches of the limit cross correlation denote the chosen temporal window 
(/(*) in Eq. EU). 



We display actual wavefields and demonstrate the process of computing kernels graphically in Figure [2] The first 
column shows snapshots of the intermediate wavefield r\ forced by a source at a at three time instants — this wavefield 
is filtered, time reversed and fed into the code as a source for the forward wavefield, «I> Q , seen in the second column. 
The adjoint source is computed using the predicted limit cross correlation that is derived from the forward wavefield 
and used to drive the adjoint wavefield & a p (where f3 is the receiver), depicted in reverse time in the third column at 
a number of instants. The final two columns show the interaction integral and stages in the construction of the partial 
kernel. This entire process must be repeated with f3 as source and a the receiver and its contribution must be added 
to the partial kernel obtained previously (shown in Figure [3]) . 
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Fig. 2. — Snapshots in time of various wavefields, interaction integral, and kernel zoomed in and out (akin to Figures 2 and 3 of 
ITromp ct al. 2010). The first three columns show vertical components of T7, # Q , ^'ag, and the next two display the interaction integral 

and horizontal flow kernel, X\, K^J with the final column showing a zoomed out picture of the kernel. The adjoint field a p ar >d spectral 
response wavefield r\ are shown in reverse time in order to highlight the computational algorithm: (1) we time reverse r\ and feed into the 
forward calculation and (2) the kernel is calculated via a convolution between the forward and adjoint. The two marks denote locations 
of source (left) and receiver. Total solar time of the spectral response calculation (77; left column) is 2.5 hrs, and forward and adjoint are 
5 hrs each. Note that the forward field (second column) is centered around the source point while the adjoint (third column) is centered 
around the receiver. 



Cuts through kernels are shown in Figure |3l The upper three panels show partial contributions and full kernels, at 
a depth z = —0.5 Mm; vertical cuts through the y — center- line for K Vm and K Vz are displayed on the fourth panel. 
Only the /-mode contributes to the kernel as evidenced by the consta ncy in sign of the ker nel as a function of depth. 
The x— and y— (anti-) symmetries of kernels are as expected (see e.g.. lBirch fc G izon 2003). The two faint horizontal 
lines seen at z — —0.2 Mm and z = 0.2 Mm in the vertical cuts of the x— and z— kernels correspond to the excitation 
depth and observation height respectively; the intermediate wavefield is computed with a source at the former depth 
and the cross-correlation and adjoint sources are injected at the latter height. The integral of the i-flow kernel may 
be directly estimated from the power spectrum — the two values agree to within a few percent (see appendix |E| . 

We also compute the sound-speed kern el for the mean travel-time measured using the p± ridge. Mean travel times 
are measured according to equation (|D10j) . The intermediate, forward, and adjoint simulations are performed and the 
interaction of the latter two is computed in accordance with equation (|69[) . The filtered power spectrum and limit cross 
correlation for the measurement are shown in Figure 2] The positive and negative branches are slightly phase shifted, 
suggesting the requirement of a larger computational domain. The kernel for this measurement is shown in Figure [5j 
The raypath corresponding to 10 Mm angular (horizontal) distance is also plotted for reference; note similarities to 
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Fig. 3. — Partial contributions to flow kernels (first two rows) and their sum (third row), displayed at a depth of z = —0.54 Mm. A 
vertical cut through the K Va . and K Vz kernels along the y = center line is shown on the fourth row. These are computed around 
the translati onally-invariant poly tropic background described in appendix [C] Symmetries and magnitudes of the kernels are in line with 
expectation (Birch & Gizon 2007). Note that there is an extra factor of time in the dimension of the flow kernels that arises from assimilating 
observed travel times into the kernels. 

sound-speed kernels computed bv lBirch et al.1 (|2004T l. 

8. CONCLUSIONS 

We present a general algorithm to compute sensitivity kernels and discuss its application in inversions for solar 
structure and dynamics. One may envisage using this method to compute kernels for the spherical Sun, for problems 
that require non-linear inverse techniques such as sunspots and supergranules, and perhaps move towards the greater 
goal of full-waveform inversion where properties derived from cross correlations, such as amplitude, may also be 
accounted for. Computational improvements, such as the ability to recover identical positive and negative branches 
of the cross correlation for a laterally-invariant background are required in order that amplitude information may be 
effectively used. Further, one must also consider the fundamental problem of convective instability in near-surface 
layers of models of solar stratification before this technique of computing kernels becomes directly applicable to the 
Sun. 
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APPENDIX 

SEISMIC RECIPROCITY 

We recall the helioseismic wave operator defined in equation ([3]). The operator may be split into two parts, Hermitian 
H and anti-Hermitian W, where the former satisfies the following relation 

f dx$ A --HS B = f dx$ B -H$ A . (Al) 

Jo Jo 
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Fig. 5. — Sound-speed kernel for a mean travel-time measurement using a pi mode. The antennae are separated by 10 Mm. The solid 
line denotes the ray path. A double-bounce p wave, whose ray travel time is approximately 15 minutes (and therefore likely in the temporal 
window), is also noticeable. 



The proof of the self-adjointncss of the ideal MHD equations with damping and no background flow is fa i rly in tricate 
and will not be repeated here. For a thorough demonstration, please refer to e.g.. iGoedbloed fc Poedt s (2004). The 
only anti-Hermitian part of equation ((3]) contains the background velocity term 



- 2iuj \ dx.£ A - (pv ■ V)£ B = 2iuj / dx £ B ■ (pv ■ V)£ A , 

Jq Jq 



(A2) 
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where the sign of the two integrals upon switching states A and B is reversed. Green's theorem (also Eq. [57]) tells us 

[(W-2iwv. V)G(x,x A )] ip = 5 ip 5(x-x A ). (A3) 
In order to demonstrate reciprocity, we consider another wave state due to a source B, whose Green's function is given 

by 

[{H + 2iuv V)G t (x,x s )] ig =S ig S(x-x B ). (A4) 

Now consider forming the following representation G qi (x, xb)x (|A3|) — G p ^ (x, x A ) X (|A4|) and integrating over all space. 
We have 

/ dxG^(x,x s ) [(W-2iwv V)G(x,x^)] ip -G pi (x,x A ) [(% + 2iwv • V)G^(x, x^)]^ 
Jq 

= / dx G^(x,x B ) S ip 5(x - Xyt) - Gpi(x,XA) S iq S(x - x B ). (A5) 

J© 

Equations (|A1|) and (|A2I) imply the left-hand side of (|A5|) is zero. Thus we arrive at the seismic reciprocity relation 
for helioseismic waves 

G f qp (x A ,x B ,uj) = G pq (x B ,x A ,uj), (A6) 

where G^ is Green's function for an identical wave operator, except with flows reversed in sign. The adjoint operator 
C) is therefore 

£ f £ = -uj 2 p£ - iujpT£ + 2iujpw ■ V£ - V{c 2 pV ■ £ + £ ■ Vp) - V • (p£)g 

- [VxBxVx(£xB) + {Vx[Vx(£xB)]}xB] . (A7) 

MAGNETIC FIELD KERNELS 
The perturbed magnetic operator is described by 

<5££ = -(Vx<5B)x[Vx(£xB)] - {Vx[Vx(£x6B)]}xB 

-(VxB)x[Vx(£x<5B)] - {Vx[Vx(£xB)]}x£B. (Bl) 

The variation in the misfit is given by 

<5Zi = i V / rfx / *t a p . ((Vx5B)x[Vx(# Q xB)] + {Vx[Vx(* Q X(5B)]}xB 



( V x B) x [ V x (* Q x SB)] + { V x [ V x (* a x B)] } x SB 



(B2) 



In order to free the SB from the confines of the differential curl operator, we make use of the following vector identities, 



a ■ (bxc) =c • (axb) = b • (cxa) 
a-Vxb = b- Vxa V - (axb), 



and the fact that 



dxV- (axb) = 0, 

Jq 

due to the homogeneous upper boundary conditions we employ. Taking the first term, we have 
& a p' (Vx<5B)xVx(* Q xB) =(VxSB) • |[Vx(* Q xB)]x* t ct/3 |, 

J dx(VxSB)- Vx($ a xB)x$t ajS =J dxSB- jvx[Vx(* a xB)x*t a/3 ] j, 



and now the second, 



{Vx[Vx(* Q X(5B)]}xB 



= {Vx[Vx(* a x<5B)]} • (Bx*»4 



f dx {Vx[Vx(# Q x<5B)]} • (Bx*t aj3 )= f dx [Vx(* Q x<SB)] • Vx(Bx#t a/J ), 

J© J0 

f dx [Vx(* Q x<5B)] • Vx(Bx* t Q(3 )= f dx (* Q x<SB) • Vx[Vx(Bx* t a/3 )], 

J© J© 

(* Q x<SB) ■ Vx[Vx(Bx* t « /3 )] = 5B- |vx[Vx(Bx* t Q/3 )]x* ct |, 



(B3) 
(B4) 

(B5) 



(B6) 
(B7) 

(B8) 
(B9) 
(BIO) 
(Bll) 
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followed by the third 



a/3 



(VxB)xVx(* Q x<5B) 



:Vx(* Q X(5B) 



^x(VxB) 



dx Vx(* Q X(5B) 



*t ajS x(VxB) 



^ dx (* Q x<SB) • |vx[*t^x(VxB)] 



and finally, the simplest of them all 



(* Q xflB)- |vx[*t Q/jX (VxB)]| = <5B. |vx[*%x(VxB)]x* Q | 
t of them all 

* f a/3 • |vx[Vx(* q xB)]x£bJ = £B • |*t Q/3X vx[Vx(* Q xB)] 



There are other terms that arise from perturbing the equilibrium equation (|70[) . These are 

-~ / dx(V-*t Q/3 )* a -[(Vx<5B)xB + (VxB)x<5B]. 

Expanding on the first, 

(V • #t Q/J ) $ Q . (Vx<5B)xB = (Vx<5B) • [Bx(*«V • *t Q/3 )] ; 

/ dx (Vxffi) ■ [Bx(* a V-*t^)]= f dxffi.{Vx[Bx(* a V'*y]}, 

J© J© 

and the second may be manipulated so 

(V ■ *t Q/J ) $ a . (VxB)x<5B = 5B ■ [V • *t^$ aX (VxB)]. 

BACKGROUND STRATIFICATION 
For the sub-surface layers, we use the following polytropic stratification prescription: 



P(z)=Ppoly 1 



z f 



m+1 



"I + 1 Ppply 

Zf Ppoly 



C(Z): 



' m+1 



-Ppoly 



Ppoly 



(B12) 

(B13) 
(B14) 

(B15) 

(B16) 

(B17) 
(B18) 

(B19) 



(CI) 



(C2) 

We set p po i y = 1.178 x 10 5 dynes cm" 2 ,/9 P oiy = 3.093 x 1CT 7 g cm~ 3 ,z/ = -0.450 Mm, and m = 2.150. Note also 
that z is the height, i.e., the atmospheric layers are described by z > and vice versa; z = is the fiducial surface. 

Waves propagating toward the surface in the Sun are reflected by a fluctuation in the background density gradient. 
This reflection zone is located at a height of z r ~ —0.050 Mm below the surface. We mimic this by attaching the 
above polytropic stratification with an overlying isothermal layer. This patching results in a fluctuation in the density 
gradient, leading to an acoustic cut-off frequency of approximately 5.4 mHz, similar to the solar value. For z < zm, 
we apply the above prescription. For z > z r , we use the following equations 

p(z)=p iso exp 



H 



Piso T^poly I 1 

1 z f 



m+1 



p(z) = p iso exp 



— I 1 r 

Piso Ppoly I -L 

' V z f 

Piso 



(C3) 



H- 
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The relations for pi so and p; so arise from the requirement of continuity of the pressure and density at the matching 
point between the polytropic and isothermal layers. The relation for H is a consequence of enforcing hydrostatic 
balance. This model is truncated at z = —34 Mm (lower boundary; polytrope) and z = +1 Mm (upper boundary; 
isothermal layer). 

UNITS OF WAVEFIELDS, CONVENTIONS, AND DEFINITIONS 
We apply the following Fourier transform convention 



1 

2^ 



dt e iut g{t)=g(u), 
dt e iut =2n S(u), 

— OO 

g(u) = g(t), 
dw e- %ult =2n S(t), 



dw e- tu)t 



(Dl) 
(D2) 
(D3) 
(D4) 



where the Fourier-transform pair g(t),g(ui) are written similarly for convenience. The equivalence between cross- 
correlations and convolutions in the Fourier and temporal domain are written so 



h(t) = / dt' /(f) g(t + <=► h(u) = /» g(uj), 



h(t) = / dt' /(f) g(t - t') h(u) = f(u) g(u). 



The following relationship also holds (for real functions f(t),g(t)) 

1 



dt f(t) g(t) 



2tt 



duj f*(u) g(w) 



2n 



(D5) 
(D6) 

(D7) 



We now describe the physical units of various quantities (indicated by square brackets around a given variable) 
p(x)] ee Mm" 3 (J dx <J(x) = 1) 



[6(t)] ee s- 1 
[£] ee g • Mm- 



(Jdt S(t)=l) 



(£ ~ p^) 



[GjEEs-g- 1 [CG = £(x - x')8(t - t')} 

[S] ee g • Mm" 3 ■ Mm • s~ 2 [££(x, t) = S(x, t)} 

[Q] ee g-1 [f Q dx' dt' 0(x, x', t - t') ■ S(x', t>) = cf>(x, t)} 

[F] ee Mm" 3 s" 2 [£,-(x, x", t)=j dt' dx' T(x', t') k G tf (x - x', x", i - f )] 

[T 7 ] ee g 2 ■ Mm" 3 • Mm 2 ■ s~ 3 [^-(x, w)5(x - x') = (S i (x,w)£?(x',a;))] 

[j7]EEg- Mm-^s" 2 [=/df C(x,x a ,t-f)-'P(x,t / )] 

[Mi] ee Mm- 5 • s 2 [= / dt' hb q W a0 (t' + t) T(x - x', t')] 

[*] ee Mm 2 [= J dt' dx' G(x, x', i - t') ■ tj(x', f )] 

[*t] = g -l . Mm- 2 • s 4 [= / dt' dx' G(x, x', t - t') ■ M{x' , t')} 

[K v ] ee Mm- 4 ■ s 3 [= ±fdt p[Vd t ${t)} • # f (-i)] 

We use equation (4) from IGizon fc Bir ch (2004) in order to define the weight function W a p(t) for the differential 
flow measurement 

W afj (t) = -C af3 (t) f{t) + f ^ t] - , , (D8) 



At£tf/(*0 4^') 



19 



where At is the temporal rate at which the cross correlations are sampled, f(t) is a window, and the difference travel 
time 5t is given by 



St = J dt W aP {t) 5C a0 {t). (D9) 

Note that since we compute difference travel times, W a p is an odd function of time whose Fourier transform is therefore 
purely imaginary. For the sound-speed kernel, we measure mean travel times, defined as 

W aP {t) = ~C aP {t) /W ~ / r ( 7 t) l2 . (D10) 



At£ t ,/(f) C aP (f) 



VALIDATION 

We perform validation tests in order to test the quality of computed kernels and limit cross correlations. 

Classical-tomographic sound-speed kernel 

As a simple test, we compute a single-source sound-speed kernel between a pair of points located 15 Mm apart. The 
source point is forced with the function shown in the upper- most panel of Figure [5] this calculation forms the forward 
wavefield. The seismogram at the receiver 15 Mm away is shown in the middle panel where the dot-dash lines denote 
the temporal window applied to isolate the first arrival. The adjoint source, the time-reversed windowed seismogram, 
is applied at the receiver. The kernel is subsequently calculated according to equation (j69| and is shown in Figure [3 

Consider the travel time of a ray propagating along path T 

"A (El) 
n c 

where s is length measured along raypath 1Z. Fermat assures us that the raypath is invariant under small perturbations 
of sound speed. Therefore the perturbation in travel time due to spatially constant 5c/ c is given by 

5r = -^(^ = -r^. (E2) 
c J r c c 

The sound-speed kernel must therefore satisfy (having divided out At), 

f 5c 2 5c 
5t= dx -^-KAx) -r— , (E3) 



or 



JdxK c ^x)^- T -. (E4) 

We find the integral of the kernel to be —192.88 s, which compares well with half the travel time, —190 s. 

Cross correlations 

The filtered cross correlation for a translationally-invariant background model may be written as 

C(A,w) - J dk |^)(k,a;)J"(k,u;)| 2 e ak - A , (E5) 

where A is the vector connecting two observation points. Thus we may estimate the cross correlation between 
a given pair of points by inverse- Fourier transforming the power spectrum. In Figures ([IJ and (Q, we compare 
computed and spectrally-estimated cross correlations and find some differences that likely arise from the finite-size 
of the horizontal domain and absorption boundary conditions that dissipate high-group-speed (low-frequency) waves 
which reach boundaries first. 

Flow-kernel Integral 

We introduce a spatially-uniform 0.1 km/s x-directed flow (i.e., everywhere in the domain) and the corresponding 
travel-time shift using equations (ID8[) and is — 9.6 s. The change in cross correlation for this background model 
is displayed in Figure |U The flow- kernel integral is 

5t = v x I dx K Vi (x) = 0.1 / dx K Vx (x) = -10.1 s . (E6) 
J© J © 

As expected, integrals of K Vy and K Vz are zero (jBirch fc Gizonl[200l . 
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Fig . 6, — Source-time function, receiver seismogram, and adjoint source involved in the computation of a single-source sound-speed kernel 
(e.g.,[Tromp ct al. 2005). Vertical and horizontal cuts are shown. The dot-dash lines in the seismogram show the temporal window applied 
to isolate the first arrival. 



Integral of the multiple- source sound-speed kernel 

We introduce a spatially uniform 1% perturbation to c 2 (i.e., everywhere in the domain) and the corresponding travel- 
time shift computed using equations (jDlOl) and ([24]) is — 1.98 s. The change in cross correlation for this slightly-altered 
background model is displayed in Figure [§1 

The expected travel-time shift for such a perturbation is given by the following integral 

5t= [ dx K c 2(x)^- = 0.01 I dx K c 2(x) = -1.75s, (E7) 
J c Jq 

where the kernel is displayed in Figure [5] 
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FlG. 7. — A single-source sound-speed kernel (e.g.,_Gizon & Birch 2002; Tromp ct al. 2005). Vertical and horizontal cuts arc shown. The 
symbols denote source (left) and receiver positions. The integral of the kernel is -192.88 s, compared to a half wave travel time of -190 s. 
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Fig. 8. — The upper panel displays cross correlations between a point pair 10 Mm apart, corresponding to backgroun d mo dels with 
flow (solid line) and a constant x— directed flow of magnitude 0.1 km/s. The related travel-time shift, computed using (ID8ft and 11241 . 
— 9.6 s, which implies a kernel integral of — 96 s. 
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Fig. 9. — The upper panel displays cross correlations between a point pair 10 Mm apart, corresponding to background models with 
sound-speed distributions of c 2 and 1.01c 2 . The two cross correlations fall almost on top of other and t heir difference, only visible on the 
lower panel, is on the order of a few percent. The related travel-time shift, computed using IIDlOl and (1241 . is — 1.98 s, implying a kernel 
integral of —198 s. 



